Interfacing Speech Recognition and Vision Guided

Authors

Vibhav Shyam Rangarajan

Shyam Rangarajan

Abstract

One goal of a pervasive computing environment is to allow the user to interact with the environment in an easy and natural manner. The use of spoken commands, as inputs to a speech recognition system, is one such way to naturally interact with the environment. In challenging acoustic environments, microphone arrays can improve the quality of the input audio signal by beamforming, or steering, to the location of the speaker of interest. The existence of multiple speakers, large interfering signals and/or reverberations or reflections in the audio signal(s) requires the use of advanced beamforming techniques which attempt to separate the target audio from the mixed signal received at the microphone array. In this thesis I present and evaluate a method of modeling reverberations as separate anechoic interfering sources emanating from fixed locations. This acoustic modeling technique allows for tracking of acoustic changes in the environment, such as those caused by speaker motion.

Thesis Supervisor: Trevor Darrell, Associate Professor


Similar resources

Interfacing Sound Stream Segregation to Recognition - Preliminar Several Sounds Si

This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, an...


Classification of the Spoken Hindi Partially Reduplicated Words using Artificial Neural Network

Speech is the most ordinary way of exchanging information. It provides an efficient way of man-machine communication using speech interfacing. Speech interfacing involves two processes, speech synthesis and speech recognition. Speech recognition allows a computer to identify the words that a person speaks into a microphone or telephone. The two main mechanisms used in speech recognition are signal p...


A new speech enhancement: speech stream segregation

Speech stream segregation is presented as a new speech enhancement for automatic speech recognition. Two issues are addressed: speech stream segregation from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition. Speech stream segregation is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, and substitu...


Interfacing acoustic models with natural language processing systems

The research presented here focuses on implementation and efficiency issues associated with the use of word graphs for interfacing acoustic speech recognition systems with natural language processing systems. The effectiveness of various pruning methods for graph construction is examined, as well as techniques for word graph compression. In addition, the word graph representation is compared to...


A Generic and Visual Interfacing Framework for Bridging the Interface between Application Systems and Recognizers

Application systems that utilize recognition technologies such as speech, gesture, and color recognition provide human-machine interfacing to users who are physically unable to interact with computers through traditional input devices such as a mouse or keyboard. Current solutions to interface application systems with recognizers, however, use an ad hoc approach and lack a generic and s...




Publication date: 2014
